Smart Paper Notes

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, Tim Cooijmans, Aaron Courville, Cheng-Zhi Anna Huang, Jesse Engel

Category: Machine Learning

2021-12-17

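Musical expression requires control of both which notes are played and how they are performed. Conventional audio synthesizers provide detailed expressive controls, but at the cost of realism. Black-box neural audio synthesis and concatenative samplers can produce realistic audio, but offer few mechanisms for control. In this work, we introduce MIDI-DDSP, a hierarchical model of musical instruments that enables both realistic neural audio synthesis and detailed user control. Starting from interpretable Differentiable Digital Signal Processing (DDSP) synthesis parameters, we infer musical notes and high-level properties of their expressive performance (such as timbre, vibrato, dynamics, and articulation). This creates a 3-level hierarchy (notes, performance, synthesis) that gives individuals the option to intervene at each level, or to use trained priors (performance given notes, synthesis given performance) for creative assistance. Through quantitative experiments and listening tests, we demonstrate that this hierarchy can reconstruct high-fidelity audio, accurately predict performance attributes for a note sequence, independently manipulate the attributes of a given performance, and, as a complete system, generate realistic audio from a novel note sequence. By using an interpretable hierarchy with multiple levels of granularity, MIDI-DDSP opens the door to assistive tools that empower individuals across a diverse range of musical experience.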

Translated by Google Translate

Multi-surrogate Assisted Efficient Global Optimization for Discrete Problems

Qi Huang, Roy de Winter, Bas van Stein, Thomas Bäck, Anna V. Kononova

Category: Neural and Evolutionary Computing

2022-12-13

Decades of progress in simulation-based surrogate-assisted optimization and unprecedented growth in computational power have enabled researchers and practitioners to optimize previously intractable complex engineering problems. This paper investigates the possible benefit of a concurrent utilization of multiple simulation-based surrogate models to solve complex discrete optimization problems. To fulfill this, the so-called Self-Adaptive Multi-surrogate Assisted Efficient Global Optimization algorithm (SAMA-DiEGO), which features a two-stage online model management strategy, is proposed and further benchmarked on fifteen binary-encoded combinatorial and fifteen ordinal problems against several state-of-the-art non-surrogate or single surrogate assisted optimization algorithms. Our findings indicate that SAMA-DiEGO can rapidly converge to better solutions on a majority of the test problems, which shows the feasibility and advantage of using multiple surrogate models in optimizing discrete problems.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Alexa, Let's Work Together: Introducing the First Alexa Prize TaskBot Challenge on Conversational Task Assistance

Anna Gottardi, Osman Ipek, Giuseppe Castellucci, Shui Hu, Lavina Vaz, Yao Lu, Anju Khatri, Anjali Chadha, Desheng Zhang, Sattvik Sahai

Category: Natural Language Processing | Artificial Intelligence

2022-09-13

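Since its inception in 2016, the Alexa Prize program has enabled hundreds of university students to explore and compete in developing conversational agents through the SocialBot Grand Challenge. The goal of that challenge is to build agents capable of conversing coherently and engagingly with humans on popular topics for 20 minutes, while achieving an average rating of at least 4.0/5.0. However, as conversational agents attempt to help users complete increasingly complex tasks, new conversational AI techniques and evaluation platforms are needed. The Alexa Prize TaskBot Challenge, established in 2021, builds on the success of the SocialBot Challenge by introducing the requirement of interactively assisting humans with real-world cooking and do-it-yourself tasks, while making use of both voice and visual modalities. This challenge requires TaskBots to identify and understand the user's need, identify and integrate task and domain knowledge into the interaction, and develop new ways of engaging the user without distracting them from the task at hand, among other challenges. This paper provides an overview of the TaskBot Challenge, describes the infrastructure support provided to the teams through the CoBot Toolkit, and summarizes the approaches the participating teams took to overcome the research challenges. Finally, it analyzes the performance of the competing TaskBots during the first year of the competition.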

Expressive Communication: A Common Framework for Evaluating Developments in Generative Models and Steering Interfaces

Ryan Louie, Jesse Engel, Anna Huang

Category: Machine Learning

2021-11-29

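There is increasing interest from the ML and HCI communities in empowering creators with better generative models and more intuitive interfaces with which to control them. In music, ML researchers have focused on training models capable of generating pieces with increasing long-range structure and musical coherence, while HCI researchers have separately focused on designing steering interfaces that support user control and ownership. In this study, we investigate through a common framework how developments in both models and user interfaces matter for empowering co-creation, where the goal is to create music that communicates particular imagery or ideas (as is common for other purposeful tasks in music creation, such as establishing mood or creating accompanying music for another medium). Our study is distinguished in that it measures communication through both composers' self-reported experiences and how listeners evaluate this communication through the music. In an evaluation study with 26 composers creating 100+ pieces of music and listeners providing 1000+ head-to-head comparisons, we find that more expressive models and more steerable interfaces are important and complementary ways to make a difference in how composers communicate through music and to support their creative empowerment.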

MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury

Category: Machine Learning

2021-09-29

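Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf, an open framework for benchmarking machine learning in the medical domain. MedPerf will enable federated evaluation, in which models are securely distributed to different facilities for evaluation, thereby empowering healthcare organizations to assess and verify the performance of AI models in an efficient and human-supervised process, while prioritizing privacy. We describe the current challenges that the healthcare and AI communities face, the need for an open platform, the design philosophy of MedPerf, its current implementation status, and our roadmap. We call on researchers and organizations to join us in creating the MedPerf open benchmarking platform.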

Genetic-tunneling driven energy optimizer for magnetic system

Qichen Xu, Zhuanglin Shen, Manuel Pereiro, Pawel Herman, Olle Eriksson, Anna Delin

Category: Neural and Evolutionary Computing

2022-12-31

Novel topological spin textures, such as magnetic skyrmions, benefit from their inherent stability, acting as the ground state in several magnetic systems. In the current study of atomic monolayer magnetic materials, reasonable initial guesses are still needed to search for those magnetic patterns. This situation underlines the need to develop a more effective way to identify the ground states. To solve this problem, in this work, we propose a genetic-tunneling-driven variance-controlled optimization approach, which combines a local energy minimizer back-end and a metaheuristic global searching front-end. This algorithm is an effective optimization solution for searching for magnetic ground states at extremely low temperatures and is also robust for finding low-energy degenerated states at finite temperatures. We demonstrate here the success of this method in searching for magnetic ground states of 2D monolayer systems with both artificial and calculated interactions from density functional theory. It is also worth noting that the inherent concurrent property of this algorithm can significantly decrease the execution time. In conclusion, our proposed method builds a useful tool for low-dimensional magnetic system energy optimization.

ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports

Katharina Jeblick, Balthasar Schachtner, Jakob Dexl, Andreas Mittermeier, Anna Theresa Stüber, Johanna Topalis, Tobias Weber, Philipp Wesp, Bastian Sabel, Jens Ricke

Category: Natural Language Processing | Machine Learning

2022-12-30

The release of ChatGPT, a language model capable of generating text that appears human-like and authentic, has gained significant attention beyond the research community. We expect that the convincing performance of ChatGPT incentivizes users to apply it to a variety of downstream tasks, including prompting the model to simplify their own medical reports. To investigate this phenomenon, we conducted an exploratory case study. In a questionnaire, we asked 15 radiologists to assess the quality of radiology reports simplified by ChatGPT. Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed key medical findings, and potentially harmful passages were reported. While further studies are needed, the initial insights of this study indicate a great potential in using large language models like ChatGPT to improve patient-centered care in radiology and other medical domains.

Non-intrusive surrogate modelling using sparse random features with applications in crashworthiness analysis

Maternus Herold, Anna Veselovska, Jonas Jehle, Felix Krahmer

Category: Machine Learning | Machine Learning (Statistics)

2022-12-30

Efficient surrogate modelling is a key requirement for uncertainty quantification in data-driven scenarios. In this work, a novel approach of using Sparse Random Features for surrogate modelling in combination with self-supervised dimensionality reduction is described. The method is compared to other methods on synthetic and real data obtained from crashworthiness analyses. The results show that the approach described here outperforms the state-of-the-art surrogate modelling techniques, Polynomial Chaos Expansions and Neural Networks.

TAToo: Vision-based Joint Tracking of Anatomy and Tool for Skull-base Surgery

Zhaoshuo Li, Hongchao Shu, Ruixing Liang, Anna Goodridge, Manish Sahu, Francis X. Creighton, Russell H. Taylor, Mathias Unberath

Category: Computer Vision | Artificial Intelligence

2022-12-29

Purpose: Tracking the 3D motion of the surgical tool and the patient anatomy is a fundamental requirement for computer-assisted skull-base surgery. The estimated motion can be used both for intra-operative guidance and for downstream skill analysis. Recovering such motion solely from surgical videos is desirable, as it is compliant with current clinical workflows and instrumentation. Methods: We present Tracker of Anatomy and Tool (TAToo). TAToo jointly tracks the rigid 3D motion of patient skull and surgical drill from stereo microscopic videos. TAToo estimates motion via an iterative optimization process in an end-to-end differentiable form. For robust tracking performance, TAToo adopts a probabilistic formulation and enforces geometric constraints on the object level. Results: We validate TAToo on both simulation data, where ground truth motion is available, as well as on anthropomorphic phantom data, where optical tracking provides a strong baseline. We report sub-millimeter and millimeter inter-frame tracking accuracy for skull and drill, respectively, with rotation errors below 1°. We further illustrate how TAToo may be used in a surgical navigation setting. Conclusion: We present TAToo, which simultaneously tracks the surgical tool and the patient anatomy in skull-base surgery. TAToo directly predicts the motion from surgical videos, without the need of any markers. Our results show that the performance of TAToo compares favorably to competing approaches. Future work will include fine-tuning of our depth network to reach a 1 mm clinical accuracy goal desired for surgical applications in the skull base.
